The Reflective Functioning Questionnaire–Revised–7 (RFQ-R-7): A new measurement model assessing hypomentalization

Although the eight-item Reflective Functioning Questionnaire (RFQ-8) is widely used, limitations have been raised regarding its scoring procedure and structural validity. The present study aimed to examine further the latent dimensionality of the RFQ-8 and to examine linear and non-linear associations between mentalization difficulties and maladaptive psychological characteristics. Data from two separate representative samples of young adults (N = 3890; females: 51.68%; mean age: 27.06 years [SD = 4.76]) and adults (N = 1385; females: 53.20%; mean age: 41.77 years [SD = 13.08]) were used. In addition to the RFQ-8, standardized questionnaires measured the levels of impulsivity, sensation seeking, rumination, worry and well-being. Confirmatory factor analysis (CFA) was used to test the model fit of competing measurement models. CFA revealed that a revised, seven-item version of the RFQ (RFQ-R-7) with a unidimensional structure showed the most optimal levels of model fit in both samples. Impulsivity, sensation seeking, rumination and worry consistently presented significant, positive, linear associations with general mentalization difficulties in both samples. Significant quadratic associations were also identified, but these relationships closely followed the linear associations between the variables and only marginally increased the explained variance. The supported unidimensional measurement model and the associations between the general mentalization difficulties factor and maladaptive psychological characteristics indicated that the RFQ-R-7 captures a dimension of hypomentalization ranging between low and high levels of uncertainty. Increasing levels of hypomentalization can indicate a risk for less adaptive psychological functioning. Further revisions of the RFQ-8 might be warranted in the future to ensure adequate measurement of hypermentalization.


Introduction
Mentalization or reflective functioning (used as synonymous terms in the present article) indicates one's ability "to reflect on internal mental states such as feelings, wishes, goals, and self and others (i.e., being aware of not having complete understanding and comprehension of mental states), and therefore contains a moderate level of uncertainty. Genuine mentalizing is thus characterized by some degree of both certainty and uncertainty, while individuals still consider themselves to have a decent knowledge of their mental states [1]. In the RFQ-8, individuals respond to each item on a seven-point scale ranging from "strongly disagree" to "strongly agree". However, in order to calculate the subscale scores, the items of the RFQ must be recoded. For example, for the item "I don't always know why I do what I do", it was assumed that low scores (response options 1, 2 and 3) indicate hypermentalization (certainty), the middle response option (4) indicates genuine reflective functioning, and high scores (response options 5, 6 and 7) indicate hypomentalization (uncertainty). That is, only high and/or low scores on a particular item are considered when calculating each subscale score, and responses in the other categories are scored as 0 (e.g., the scoring procedure of the aforementioned item for the hypermentalization [certainty] subscale: 3, 2, 1, 0, 0, 0, 0; for the hypomentalization [uncertainty] subscale: 0, 0, 0, 0, 1, 2, 3). Overall, each of the two subscales is formed by six variables, and responses on four items of the RFQ-8 contribute simultaneously to both subscales. By using this scoring procedure, various previous studies confirmed the two-factor structure of the RFQ-8 in clinical and non-clinical samples [1,18–21].
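The recoding described above can be sketched in a few lines. This is a minimal illustration for the example item only, assuming the weights quoted in the text; the function names are hypothetical, and the keying of the other RFQ-8 items is not reproduced here.

```python
def recode_certainty(response: int) -> int:
    """Certainty (hypermentalization) weight for a 1-7 response: 3, 2, 1, 0, 0, 0, 0."""
    if not 1 <= response <= 7:
        raise ValueError("response must be an integer between 1 and 7")
    return max(0, 4 - response)


def recode_uncertainty(response: int) -> int:
    """Uncertainty (hypomentalization) weight for a 1-7 response: 0, 0, 0, 0, 1, 2, 3."""
    if not 1 <= response <= 7:
        raise ValueError("response must be an integer between 1 and 7")
    return max(0, response - 4)


# Weights for every response option of the example item
# ("I don't always know why I do what I do").
certainty_weights = [recode_certainty(r) for r in range(1, 8)]
uncertainty_weights = [recode_uncertainty(r) for r in range(1, 8)]
```

Because both recoded variables are derived from the same response, at most one of them can be non-zero for a given respondent, so the two scores for such an item are not independent of each other.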
However, recent studies highlighted that the items and the scoring procedure of the RFQ-8 can bias the measurement of reflective functioning [11,22]. Namely, the items of the RFQ-8 might capture the construct of reflective functioning difficulties insufficiently. For example, hypermentalization (certainty) is only measured by a tendency to disagree with uncertainty, externally oriented and non-behavior-related mentalization capacities might be underrepresented by the items, and multiple items refer to negative urgency-related features. Moreover, as responses on four items are simultaneously weighted on both subscales, the recoding procedure produces redundant variables and subscale scores that are non-independent in confirmatory factor analysis (CFA) [11,22]. Therefore, it was suggested that the structural validity of the RFQ-8 should be examined based on the items with the original response scale. Using this approach, a unidimensional structure of the RFQ was confirmed. Because hypermentalization (certainty) is insufficiently measured by the scale, a unidimensional structure of the RFQ-8 may indicate a continuum between low and high levels of hypomentalization (uncertainty) [11,22].
The present study had two main aims. The first was to examine the latent structure of the RFQ-8 based on the originally scored items by testing the model fit of competing measurement models in two separate representative samples of adults and young adults. To the best of the authors' knowledge, only two studies have so far examined the latent structure of the RFQ-8 based on the items with the original response scale [11,22]. Except for one study [22], previous studies predominantly analyzed the latent structure of the RFQ-8 in non-representative samples, which can limit the generalizability of the findings. Moreover, there has been no attempt in the literature to examine the reproducibility of any factor structure in separate representative samples. Therefore, it was expected that the present study would improve the understanding of the latent structure of the RFQ-8 in the general, non-clinical population. Assessing the dimensionality of the RFQ-8 specifically among young adults is important, as individuals in this age group can show an increased risk for mental health problems [23].
The second aim was to examine linear and non-linear associations between mentalization difficulties and maladaptive psychological characteristics, including rumination, worry (both considered as maladaptive emotion regulation forms) [24,25], impulsivity and sensation seeking (both considered as transdiagnostic personality dimensions for psychiatric disorders) [26] and lack of well-being (as an indirect indicator of depressive symptoms) [27]. According to the original, bipolar continuum conceptualization of mentalization difficulties by Fonagy et al. [1], a non-linear relationship can be hypothesized with maladaptive psychological outcomes (e.g., those with high hyper- and hypomentalization can show an increased risk for maladaptive psychological characteristics) [11]. On the other hand, based on the assumed unidimensional structure of the RFQ-8, it is also possible that a linear relationship best describes the association with maladaptive psychological outcomes (e.g., as the level of hypomentalization increases, the levels of maladaptive outcomes also increase) [11]. Therefore, by simultaneously examining both linear and quadratic (non-linear) effects of maladaptive psychological characteristics on mentalization difficulties, it is possible to gain further knowledge on how mentalization difficulties are linked to maladaptive psychological outcomes.

Participants and procedures
The present cross-sectional study used data from two separate representative samples. On the one hand, the sample from the first data collection round (between March and July 2019) of the Budapest Longitudinal Study (BLS) (https://www.bls2018.hu/main/en) was considered. The target population in the BLS was the young adult population (aged 18–34 years) of Budapest, the capital city of Hungary. A stratified and random sampling procedure yielded a sample that was representative in terms of age and district of residence (net sample size: N = 3890; proportion of females: 51.68% [N = 2010]; mean age: 27.06 years [SD = 4.76]). On the other hand, data (collected between March and April 2019) from the National Survey on Addiction Problems in Hungary 2019 (NSAPH 2019) were also considered in the present study [28]. The NSAPH 2019 used a nationally representative sample of the Hungarian adult population aged 18–64 years. A stratified and random sampling procedure ensured sample representativeness in terms of age, geographic regions and size of residence ( [1]. Individuals provided responses for each item on a seven-point scale (1 = Do not agree at all, 7 = Agree completely). Further characteristics of the scale are described in the Introduction. Previous studies have not yet tested the psychometric properties of the Hungarian version of the RFQ-8; this was first done in the present study.

Barratt Impulsiveness Scale (BIS).
The level of trait impulsivity (e.g., cognitive and behavioral impulsivity, impatience/restlessness) was measured by a shortened, 10-item version of the revised, 21-item BIS (BIS-R-21) [29,30]. Respondents evaluated the items of the questionnaire on a four-point scale (1 = Rarely/Never, 4 = Almost always/Always). A three-factor structure (including factors of cognitive and behavioral impulsivity, impatience/restlessness) was confirmed in Hungarian samples during the development of the BIS-R-21, with acceptable levels of internal reliability, and the validity of the impulsivity factors was also supported via their associations with externalizing characteristics and psychological distress [29]. A total scale score was calculated to represent the overall level of impulsivity. Adequate levels of internal consistency were found in both samples (Young adults: α = 0.78; Adults: α = 0.82).

Brief Sensation Seeking Scale (BSSS).
The 8-item BSSS was used to measure the respondents' sensation seeking tendencies (e.g., thrill, adventure and experience seeking, boredom susceptibility, disinhibition) [31,32]. Each item was assessed on a five-point scale (1 = Strongly disagree, 5 = Strongly agree), and a total scale score was calculated from the items. In the Hungarian adaptation of the BSSS, the single-factor structure was characterized by adequate fit and internal consistency, while the validity of the scale was supported by its relationships with smoking and smoking expectancies [32]. High levels of internal consistency were shown in both samples (Young adults: α = 0.83; Adults: α = 0.83).

Ruminative Response Scale (RRS).
The 10-item RRS was used to explore the level of ruminative tendencies (e.g., brooding, reflection) in response to negative affectivity [33,34]. Responses were provided for all items on a four-point scale (1 = Never, 4 = Always). Previous research using the Hungarian version of the 10-item RRS has confirmed the scale's two-factor structure (including brooding and reflection factors) with adequate internal consistency levels, and the validity of the brooding and reflection subscales has also been supported based on their associations with psychological distress [33,35]. Due to the high correlations between the brooding and reflection subscales (r ≥ 0.76), the overall level of ruminative thinking was considered based on the total scale score. Very high levels of internal consistency were demonstrated in both samples (Young adults: α = 0.91; Adults: α = 0.92).

Penn State Worry Questionnaire (PSWQ).
The 3-item, ultra-brief version of the PSWQ was used to measure the tendency for excessive worrying as an emotion regulation strategy [36,37]. A total scale score was calculated based on the 3 items which were assessed on a five-point scale (1 = Not at all typical, 5 = Very typical). The Hungarian validation of the original 16-item PSWQ confirmed a structure with a general worry and two methodological factors, and the internal consistency and validity of the scale (e.g., through its associations with depression and anxiety) were also found to be adequate [37]. Optimal levels of internal consistency were shown in both samples (Young adults: α = 0.93; Adults: α = 0.91).

World Health Organization Well-Being Index (WHO WBI).
The participants' general level of well-being over the past month (as a possible indicator of the absence of depressive symptoms) was evaluated with the 5-item WHO WBI (WHO WBI-5) [27,38,39]. The respondents evaluated the items of the scale on a four-point scale (0 = Was not typical, 3 = Was very typical) [39]. In the Hungarian adaptation of the WHO WBI-5, the unidimensional structure of the scale was supported with good internal consistency, and the validity of the scale was also found to be acceptable based on the correlations observed with depression, anxiety, hopelessness, life meaningfulness, and subjective health [39]. The total scale score was considered in the analyses. Decreased total scores on the WHO WBI-5 can indicate a risk for major depression [27]. High levels of internal consistency were found in both samples (Young adults: α = 0.88; Adults: α = 0.89).

Data analysis
CFA was performed separately among young adults and adults to test the model fit of competing factor models of the RFQ-8. First, the two-factor model of the RFQ-8 was examined. The certainty and uncertainty factors were each defined by six recoded variables, each using a four-point scale (between 0 and 3). These variables were treated as ordered categorical variables; thus, the weighted least squares means and variance adjusted (WLSMV) estimation method was applied. However, as scores on some of the indicator items were dependent on each other (i.e., the polychoric correlations between these variables are -1) [11], measurement models of the RFQ-8 based on the items using the original, seven-point response scale were considered primarily.
Two competing measurement models were estimated and compared in terms of model fit, partially building on Müller et al.'s [11] approach: (i) a one-factor model (i.e., the 8 items of the RFQ-8 loaded on a general mentalization difficulties factor), and (ii) a two-factor model (i.e., following the original approach of Fonagy et al. [1], six items loaded on each of the correlated certainty and uncertainty factors, so that some of the variables loaded simultaneously on both factors). All indicator variables in these models were treated as ordered categorical variables; therefore, the WLSMV estimation method was applied.
Multiple model fit indices were considered to assess the level of model fit of each model. For the Comparative Fit Index (CFI) and the Tucker-Lewis Index (TLI), values ≥ 0.900 indicate adequate model fit and values ≥ 0.950 show optimal model fit. For the Root Mean Square Error of Approximation (RMSEA), adequate model fit is shown by values ≤ 0.080 and optimal model fit is indicated by values ≤ 0.050. However, in accordance with literature recommendations, the models were not evaluated on the basis of fit indices alone [40]. The fit indices can indicate a model's poor fit or inappropriate specification, but they are not suitable on their own to draw conclusions about the acceptability and applicability of a model. Therefore, the most optimal solution was selected not solely on the basis of the model fit indices, but also by interpreting the parameter estimates of a model [40]. Regarding the latter criterion, the models were evaluated in terms of the direction, magnitude and significance of the factor loadings and correlations, taking into account both statistical and theoretical aspects [40]. For statistical aspects, an important evaluation criterion was that the associations between the observed indicators and the latent factors should be meaningful. That is, it was expected that the factor loadings should be significant and at least moderately strong (i.e., λ ≥ 0.30). In addition, for CFA parameters, it was also taken into account that they should not show out-of-range values, which could indicate model misspecification or even a nonpositive definite matrix (e.g., with Heywood cases, correlations or loadings above 1.00) [40]. For the theoretical evaluation of parameter estimates, the direction of factor loadings and correlations was evaluated to ensure that they were in line with the a priori assumptions [40].
Finally, for the two-factor model, an important evaluation criterion was whether the discriminant validity of the two latent factors could be supported (e.g., factor correlations above 0.80 may indicate inadequate discriminant validity, with excessive overlap between factors, raising issues about their practical and statistical distinguishability) [40]. McDonald's omega reliability index (ω) was calculated as a measure of internal consistency for the best fitting model.
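The evaluation criteria described above can be summarized as a small screening routine. The following is only a sketch under stated assumptions: the function names and the example values are hypothetical, the cutoffs are those given in the text, and the significance of the loadings and the theoretical plausibility of their direction still have to be judged separately.

```python
from typing import Dict, List


def classify_fit(cfi: float, tli: float, rmsea: float) -> Dict[str, str]:
    """Label each fit index as 'optimal', 'adequate' or 'insufficient'."""

    def label_cfi_tli(value: float) -> str:
        # Higher is better: >= 0.950 optimal, >= 0.900 adequate.
        if value >= 0.950:
            return "optimal"
        if value >= 0.900:
            return "adequate"
        return "insufficient"

    def label_rmsea(value: float) -> str:
        # Lower is better: <= 0.050 optimal, <= 0.080 adequate.
        if value <= 0.050:
            return "optimal"
        if value <= 0.080:
            return "adequate"
        return "insufficient"

    return {
        "CFI": label_cfi_tli(cfi),
        "TLI": label_cfi_tli(tli),
        "RMSEA": label_rmsea(rmsea),
    }


def parameter_flags(loadings: List[float], factor_correlations: List[float]) -> List[str]:
    """Return warnings for standardized estimates that violate the stated criteria."""
    flags = []
    if any(abs(value) > 1.00 for value in loadings):
        flags.append("out-of-range loading (possible Heywood case)")
    if any(value < 0.30 for value in loadings):
        flags.append("loading below 0.30 or negative")
    if any(abs(value) > 0.80 for value in factor_correlations):
        flags.append("factor correlation above 0.80 (weak discriminant validity)")
    return flags


# Hypothetical values, for illustration only (not results from this study).
example_fit = classify_fit(cfi=0.96, tli=0.93, rmsea=0.09)
example_flags = parameter_flags(loadings=[0.75, 1.04, 0.22], factor_correlations=[0.92])
```

A model can therefore look acceptable on the CFI and the TLI and still be rejected once its parameter estimates are screened.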
Next, a hierarchical, multiple indicators multiple causes (MIMIC) model was performed separately among young adults and adults to investigate the associations between the latent factor(s) of mentalization difficulties and maladaptive psychological characteristics. The outcome variable of the MIMIC model was the latent factor(s) of mentalization difficulties based on the best-fitting measurement model. The predictor variables were defined as observed variables and entered in two separate steps. First, the linear predictive effects of impulsivity, sensation seeking, rumination, worry and well-being were examined, while controlling for gender, age, level of education, economically active vs. inactive status (Model 1). Second, in addition to the abovementioned linear predictive effects, the quadratic terms related to impulsivity, sensation seeking, rumination, worry and well-being were also added to the model in order to investigate their non-linear (quadratic) associations with mentalization difficulties (Model 2). The predictor variables of impulsivity, sensation seeking, rumination, worry and well-being were standardized to avoid issues related to high multicollinearity (e.g., suppressor effects). The change in explained variance was also considered between the two steps of the MIMIC model, which could inform about the magnitude of the quadratic (non-linear) effects. The indicator variables of the latent factor(s) of mentalization difficulties were defined as ordered categorical variables in the MIMIC model, therefore the WLSMV estimation method was applied. Model fit of the MIMIC model was assessed based on the abovementioned criteria related to the CFI, the TLI and the RMSEA.
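The rationale for standardizing predictors before adding quadratic terms can be illustrated with a short sketch. It does not reproduce the WLSMV-based MIMIC estimation; it only prepares a linear and a quadratic term for one predictor, using a hypothetical, evenly spread range of total scores, and compares how strongly the two terms correlate before and after standardization.

```python
from math import sqrt
from statistics import fmean, pstdev
from typing import List


def standardize(scores: List[float]) -> List[float]:
    """Convert raw scores to z-scores (mean 0, standard deviation 1)."""
    mean = fmean(scores)
    sd = pstdev(scores)
    return [(score - mean) / sd for score in scores]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation between two equally long lists."""
    mean_x, mean_y = fmean(x), fmean(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical total scores from 10 to 40 (assumption, not study data).
raw_scores = [float(score) for score in range(10, 41)]
z_scores = standardize(raw_scores)

# Correlation between the linear term and its square, raw vs. standardized.
r_raw = pearson(raw_scores, [score ** 2 for score in raw_scores])
r_standardized = pearson(z_scores, [score ** 2 for score in z_scores])
```

With raw scores the linear and quadratic terms are almost perfectly correlated, whereas after standardization the correlation vanishes for this symmetric example; with skewed real data it would typically be reduced rather than eliminated.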
All analyses were performed by using Mplus 8.0 [41] statistical software. Sampling weights were considered during the analyses to ensure representativeness of the samples. Data and codes for analyses for this study are available at https://osf.io/7w2m3/.

Confirmatory factor analysis (CFA)
Model fit of the different measurement models of the RFQ-8 is summarized in Table 1. The original, two-factor model with recoded items showed optimal model fit among young adults and adequate model fit among adults based on the CFI and the TLI, whereas the RMSEA indicated insufficient model fit in both samples. Extremely high, negative correlations were shown between certainty and uncertainty factors (for factor loadings and correlations in this model, see: S1 Table).
The one-factor model based on the 8 items of the RFQ-8 with the original response scale presented adequate to optimal levels of model fit in both samples based on the CFI and the TLI, whereas the RMSEA indicated insufficient levels of model fit among both young adults and adults (for factor loadings in this model, see: Table 2). Except for the reversely coded Item 7 ('I always know what I feel'), items of the RFQ-8 showed moderate-high, positive and significant factor loadings on the mentalization difficulties factor in both samples. Item 7 had significant, but marginal-small factor loadings on the mentalization difficulties factor in both samples (λ = 0.08-0.22). The general mentalization factor in the one-factor model of the RFQ-8 presented high internal consistency among both young adults and adults.
By using the 8 items of the RFQ-8 with the original response scale, the two-factor model showed optimal levels of model fit based on the CFI and the TLI, whereas the RMSEA showed insufficient levels of model fit among young adults as well as among adults (for factor loadings and correlation in this model, see: S2 Table). However, extremely high, positive factor correlations were presented in both samples (r ≥ 0.90). This excessive overlap between the two factors raised questions regarding their distinguishability and indicated their insufficient discriminant validity [40]. Regarding factor loadings, both statistical and theoretical problems were identified. Possibly biased, out-of-range factor loadings (λ>1.00) were presented in both samples (i.e., Items 4 and 6 among young adults and Item 6 among adults). In line with this, the latent variable covariance matrix was not positive definite for this model among adults due to a negative item residual variance. This may indicate the presence of Heywood cases and biased results, so these findings should be interpreted with great caution [40]. In addition, the interpretation of the two latent factors was theoretically difficult. Contrary to prior expectations, Items 4 ('When I get angry, I say things that I later regret') and 6 ('Sometimes I do things without really knowing why') presented negative factor loadings on the factors of Uncertainty and Certainty, respectively.
This makes the content of the two factors ambiguous, as items that are related and overlapping in content presented both positive and negative loadings on the same factor in both samples (e.g., Item 6 presented a negative loading on the factor Certainty, while Item 2 ['I don't always know why I do what I do'] positively loaded on this factor). As with the one-factor model, Item 7 had significant but marginal-small factor loadings on the Uncertainty factor in both samples (λ = 0.08-0.22). Overall, although the fit indices were found to be better for the two-factor model (compared to the one-factor model), statistical and theoretical problems emerged in the evaluation of the parameter estimates, which raised questions regarding the structural and discriminant validity of the two-factor solution. Thus, this model was not considered further.
Due to the low factor loadings of Item 7 in all of the tested models with the items using the original response scale, modified measurement models were constructed by excluding Item 7. This version of the scale was named the Reflective Functioning Questionnaire-Revised-7 (RFQ-R-7) (see: S1 Appendix). Model fit of these models with 7 items using the original response scale is shown in Table 1. The one-factor model had optimal levels of model fit based on the CFI and the TLI, and insufficient model fit as per the RMSEA in both samples (for factor loadings in this model, see: Table 2). All factor loadings were positive, strong and significant among young adults. In the adult sample, Item 1 ('People's thoughts are a mystery to me') had a positive, moderately strong and significant loading on the latent factor, whereas the other items showed positive, strong and significant relationships with the general mentalization difficulties factor. The general mentalization factor in the one-factor model of the RFQ-R-7 presented high internal consistency in both samples.
In both samples, the two-factor model presented optimal levels of model fit based on the CFI and the TLI, and insufficient levels of model fit based on the RMSEA. However, the two-factor model of the RFQ-R-7 using the original response scale presented statistical issues similar to those of its 8-item alternative (S2 Table). Extremely high positive factor correlations were presented in both samples (r ≥ 0.92). This excessive overlap between the two factors indicated insufficient discriminant validity for the Certainty and Uncertainty factors. This suggests that it may be more appropriate to combine the two factors and work with the more parsimonious single-factor solution [40]. Both statistical and theoretical problems were highlighted in the evaluation of the factor loadings. Possibly biased, out-of-range factor loadings (λ>1.00) were presented in both samples (i.e., Items 4 and 6 among young adults and Item 6 among adults). In line with this, the latent variable covariance matrix was not positive definite for this model among adults due to a negative item residual variance.
This may indicate the presence of Heywood cases and biased results, so these findings should be interpreted with great caution [40]. Moreover, Items 2 and 4 showed non-significant factor loadings on the factors of Certainty and Uncertainty, respectively. In addition, the interpretation of the two latent factors was theoretically difficult. Contrary to prior expectations, Items 4 and 6 presented negative factor loadings on the factors of Uncertainty and Certainty, respectively. This makes the content of the two factors ambiguous, as items that are related and overlapping in content presented both positive and negative loadings on the same factor in both samples (e.g., Item 6 presented a negative loading on the factor Certainty, while Item 2 positively loaded on this factor). Overall, in the case of the RFQ-R-7, although the fit indices were found to be better for the two-factor model (compared to the one-factor model), statistical and theoretical problems emerged in the evaluation of the factor loadings and correlations, which raised questions regarding the structural and discriminant validity of the two-factor solution. Thus, this model was also not considered further.
Overall, the one-factor model of the RFQ-R-7 with a general mentalization difficulties factor was considered the most optimal measurement model in both samples. The CFI and the TLI indicated high levels of model fit for this model in both samples, with moderate-high factor loadings and high internal consistencies. Although the RMSEA showed inadequate model fit among both young adults and adults, it is important to note that for models with low degrees of freedom (df, models of the RFQ-R-7 with originally scored items: df = 9-14) the RMSEA can show an insufficient level of model fit, even with very high average factor loadings [42,43]. Therefore, it is possible that the other fit indices (i.e., CFI, TLI) provided a more accurate estimation of the model fit in the current analyses. Although the two-factor model of the RFQ-R-7 was characterized by closer fit to the data than the one-factor model of the RFQ-R-7, the interpretation of the parameter estimates in the two-factor model raised concerns regarding its acceptability and applicability. Among the tested models with the original response scale, the parameter estimates only showed sufficient characteristics in the one-factor model of the RFQ-R-7. For the other models, non-significant and/or low factor loadings were found (i.e., the one-factor model of the RFQ-8 in both samples, and the two-factor model of the RFQ-8 and the RFQ-R-7 in both samples). In addition, Heywood cases (i.e., out-of-range factor loadings with λ>1.00) and a pattern of factor loadings that was ambiguous in content (i.e., items with similar content simultaneously showed positive and negative loadings on the same factor) were presented for the two-factor model of the RFQ-8 and the RFQ-R-7. The extremely low discriminant validity of the Certainty and Uncertainty factors in the two-factor model of the RFQ-8 and the RFQ-R-7 also suggested that it might be worth retaining a more parsimonious single-factor solution [40].
Therefore, the one-factor model of the RFQ-R-7 was retained for the MIMIC model.
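The sensitivity of the RMSEA to low degrees of freedom can be made concrete with the commonly used point-estimate formula, RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). The sketch below is illustrative only: the chi-square values are hypothetical, some programs use N instead of N − 1, and the scaled WLSMV chi-square used in the present analyses is not reproduced here.

```python
from math import sqrt


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of the RMSEA from a chi-square value, its df and the sample size."""
    if df <= 0 or n <= 1:
        raise ValueError("df must be positive and n must exceed 1")
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


N_ADULTS = 1385  # adult sample size reported in the text

# Two hypothetical models with the same excess misfit (chi-square - df = 50)
# but different degrees of freedom.
rmsea_low_df = rmsea(chi_square=59.0, df=9, n=N_ADULTS)
rmsea_high_df = rmsea(chi_square=100.0, df=50, n=N_ADULTS)
```

Because the excess misfit is divided by df, the model with 9 degrees of freedom receives a clearly larger RMSEA than the model with 50 degrees of freedom, although the absolute amount of misfit is identical in both hypothetical cases.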

Multiple indicators multiple causes (MIMIC) model
Predictive effects in the MIMIC model in both samples are summarized in showed adequate and optimal levels of model fit. In the first step, higher levels of mentalization difficulties were significantly and positively associated with economically active status, impulsivity, sensation seeking, rumination, worry and well-being. However, the significant effects of economically active vs. inactive status and well-being had marginal effect sizes (β<0.10). Impulsivity showed a strong association with the general mentalization difficulties factor, while the other significant relationships were weak. In the second step, the abovementioned significant (linear) predictive effects remained significant and had effect sizes similar to those in Model 1. In addition, the quadratic terms related to impulsivity, sensation seeking, rumination, and worry were also significant. These non-linear associations with general mentalization difficulties are shown in S1-S4 Figs. Overall, the quadratic relationships closely followed the linear associations between the variables. In line with this, only a marginal increase (1%) was shown in the overall level of explained variance of the outcome between Models 1 and 2. The significant quadratic effects of impulsivity (S1 Fig In the first step, elevated levels of impulsivity (with moderate effect size), sensation seeking, rumination and worry (with weak effect sizes) significantly and positively predicted general mentalization difficulties. These significant (linear) relationships remained significant in the second step as well, with effect sizes similar to those in Model 1. Moreover, there was a significant and negative (linear) relationship between well-being and mentalization difficulties, though with a marginal effect size. The quadratic terms related to sensation seeking and worry were also significant. These non-linear associations with general mentalization difficulties are shown in S5 and S6 Figs. Overall, there were only minor differences between the linear and quadratic trend lines in these associations. In line with this, only a marginal increase (2%) was shown in the overall level of explained variance of the outcome between Models 1 and 2. The significant quadratic effects of sensation seeking (S5 Fig)

Discussion
The present study aimed to examine further the latent dimensionality of the RFQ-8 in two separate representative samples of young adults and adults. CFA was used to test the model fit of competing measurement models of the RFQ-8. Considering the limitations of the recoding procedure regarding the calculation of the subscales of the RFQ-8, the analyses primarily examined measurement models with items of the RFQ-8 using the original, seven-point response scale. The present analyses revealed that Item 7 ('I always know what I feel') shows only weak associations with any factor of mentalization difficulties. Therefore, it was suggested that more precise measurement of mentalization difficulties can be obtained by using a revised, seven-item version of the brief RFQ, without Item 7 (named as the Reflective Functioning Questionnaire-Revised-7, RFQ-R-7) (see: S1 Appendix). A previous study using representative adult sample from Germany also reported insufficient psychometric properties for Item 7 [22]. This pattern might be explained by the difference in formulation for Item 7 compared to the other items of the RFQ-8. Higher agreement with this item refers to a certainty of understanding internal mental states, whereas stronger agreement with the other items of RFQ-8 can indicate tendencies of uncertainty of understanding internal mental states.
Overall, a one-factor model with a general mentalization difficulties factor showed the most optimal levels of model fit in both samples. In the case of the RFQ-R-7, each item had significant, moderate-strong and positive associations with the general mentalization difficulties factor. Moreover, the general factor was also characterized by high levels of internal consistency in both samples. These findings suggested that a unidimensional latent structure can describe most optimally the covariance between the items of the RFQ-R-7. Previous studies which performed CFA by using the items of the RFQ-8 with the original response scale also confirmed unidimensional structures for the scale [11,22]. Overall, the present findings might add to the current understanding on the latent structure of the scale by examining that in a new cultural context (i.e., among Hungarian individuals, previous studies with similar analytical approach included participants from Germany and the United States) and by confirming the replicability of the one-factor structure in two separate representative samples of adults and young adults.
It is important to note, however, that for both the RFQ-8 and the RFQ-R-7, the two-factor model showed a better fit to the data than the single-factor model. Nevertheless, the statistical (e.g., Heywood cases with out-of-range factor loadings, non-distinguishable factors due to the extremely high correlations between them) and theoretical problems (e.g., factor content that is not straightforward because items with similar content load positively and negatively on the same factor) associated with the parameter estimates in the two-factor models did not allow these models to be accepted and applied. Future research is needed to test the validity of these models in other samples or to suggest revisions to them.
Next, linear and non-linear associations were examined between the mentalization difficulties factor of the RFQ-R-7 and maladaptive psychological characteristics. Impulsivity, sensation seeking, rumination and worry consistently presented significant, positive and at least weak linear associations with general mentalization difficulties in both samples of young adults and adults. Among these maladaptive psychological characteristics, impulsivity showed the strongest association with mentalizing difficulties, with strong and moderately strong associations among young adults and adults, respectively.
Both impulsivity and sensation seeking are externalizing personality traits which are considered transdiagnostic risk factors for externalizing psychopathologies [26,44-46]. Previous studies also reported positive correlations between impulsivity and mentalization difficulties [10,11,15], whereas limited data are available on the link between sensation seeking and difficulties in reflective functioning. The co-occurrence of high levels of impulsivity, sensation seeking and mentalization difficulties might be explained by the preference for disinhibition and lack of premeditation across these psychological constructs (e.g., the items "Sometimes I do things without really knowing why" and "I don't always know why I do what I do" from the RFQ-R-7 show overlap with impulsivity and sensation seeking). Impulsive behaviors may occur when a person does not adequately consider and plan how their behavior may have behavioral and mental consequences (e.g., emotions, thoughts) for themselves and others (i.e., hypomentalization). In addition, in the presence of mentalizing impairments, impulsive and unrealistic assumptions about the thoughts and feelings of others may often be made (i.e., hypermentalization) [17]. Moreover, negative urgency-related characteristics are considered forms of impulsivity as well as reflective functioning difficulties according to the RFQ-R-7 [11]. Individuals with high sensation seeking can also present impairments in emotion regulation, specifically problems in regulating positive emotions [16]. That is, general emotion regulation difficulties may also account for the positive link between mentalization difficulties and the externalizing personality traits of impulsivity and sensation seeking.
Both rumination and worry can be conceptualized as maladaptive forms of emotion regulation, as they can show positive relationships with adverse mental health outcomes [24,25]. Therefore, the co-occurrence of high levels of rumination, worry and mentalization difficulties is in accordance with previous findings which showed positive correlations between difficulties in emotion regulation, maladaptive emotion regulation strategies and impairments in reflective functioning [2,10-14]. The positive correlations between these variables might be accounted for by the overlap between the items of the RFQ-R-7 and general emotion regulation difficulties (e.g., the item "People's thoughts are a mystery to me"), whereas other items of the RFQ-R-7 show similar characteristics to rumination and worry (e.g., the item "Strong feelings often cloud my thinking"). Furthermore, a possible explanation for the positive associations between mentalizing difficulties and repetitive negative thinking styles (i.e., rumination and worry) is that affected individuals may have a general difficulty thinking about mental states. For example, previous research has indicated that metacognitive beliefs may be associated with the use of maladaptive emotion regulation strategies such as rumination and worry [47,48]. Metacognitive beliefs (e.g., positive beliefs about the benefits of using worry and rumination, negative beliefs about the uncontrollability and dangers of thoughts) overlap with the concept of mentalization: they also allow understanding of others' and one's own mental states and their motivations, and foster taking the perspective of others [49].
The validation analyses of the RFQ-R-7 also revealed significant quadratic (non-linear) effects, for example, with sensation seeking and worry in both samples. However, the significant quadratic relationships closely followed the linear associations between the variables (e.g., suggesting marginal plateau and threshold effects). Accordingly, including the quadratic effects increased the explained variance of mentalization difficulties only marginally compared with the model containing only the linear predictive effects. In line with previous literature, these findings may suggest that the general mentalization difficulties factor in the unidimensional structure of the RFQ-8 and the RFQ-R-7 can indicate a continuum between low and high levels of hypomentalization (uncertainty), while hypermentalization (certainty) is measured insufficiently by these scales [11,22]. That is, based on the assumed unidimensional structure of the RFQ-R-7, it may be possible that an approximately linear relationship best describes the relationship with maladaptive psychological outcomes (e.g., as the level of hypomentalization increases, the levels of maladaptive outcomes also increase) [11]. In other words, increasing scores on items of the RFQ-R-7 rated on the original seven-point scale may indicate increasing hypomentalization, meaning that low scores may represent more adaptive psychological functioning, while higher scores may imply more maladaptive psychological functioning. On the other hand, the general mentalizing difficulties (hypomentalization) factor measured by the RFQ-R-7 is probably not suitable for testing Fonagy et al.'s concept of the bipolar continuum of mentalizing difficulties [1]. Under that concept, a non-linear relationship can be hypothesized with maladaptive psychological outcomes: on some of the items low scores can represent hypermentalization (certainty), whereas high scores can indicate hypomentalization (uncertainty); and those with high hyper- and hypomentalization can show an increased risk for maladaptive psychological characteristics [1,11].
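The comparison between linear and quadratic effects discussed above can be sketched as follows. This is only an illustration under stated assumptions: it uses ordinary least squares on observed scores rather than the latent-variable MIMIC models applied in the study, the data are synthetic, and the function and variable names (`r_squared`, `quadratic_gain`, `predictor`, `difficulties`) are hypothetical. The sketch fits a linear and a linear-plus-quadratic model and reports how much explained variance the quadratic term adds.

```python
import numpy as np


def r_squared(y, y_hat):
    """Proportion of variance in y explained by the fitted values y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def quadratic_gain(predictor, outcome):
    """Return (R2 linear, R2 linear+quadratic, gain in R2).

    The predictor is mean-centered first, which reduces collinearity
    between the linear and the squared term.
    """
    x = np.asarray(predictor, dtype=float)
    y = np.asarray(outcome, dtype=float)
    xc = x - x.mean()
    linear = np.polyfit(xc, y, deg=1)      # slope and intercept
    quadratic = np.polyfit(xc, y, deg=2)   # adds the squared term
    r2_lin = r_squared(y, np.polyval(linear, xc))
    r2_quad = r_squared(y, np.polyval(quadratic, xc))
    return r2_lin, r2_quad, r2_quad - r2_lin


if __name__ == "__main__":
    # Synthetic data: a mostly linear relationship with slight curvature.
    rng = np.random.default_rng(0)
    predictor = rng.normal(size=500)
    difficulties = (0.5 * predictor + 0.05 * predictor ** 2
                    + rng.normal(size=500))
    r2_lin, r2_quad, gain = quadratic_gain(predictor, difficulties)
    print(f"linear R2={r2_lin:.3f}, quadratic R2={r2_quad:.3f}, "
          f"gain={gain:.3f}")
```

Because the linear model is nested in the quadratic one, the gain cannot be negative; a gain close to zero corresponds to the pattern reported here, in which the quadratic term is statistically detectable but the curve stays close to the straight line.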

Limitations
Cautious interpretation of the present findings is warranted due to methodological limitations. First, the cross-sectional design of the study did not allow testing of the longitudinal invariance of the confirmed unidimensional structure of the questionnaire or of causal relationships between mentalization difficulties and the validating psychological variables. Second, due to the self-report measurement it is possible that differences in mentalization difficulties were not captured accurately (e.g., automatic and fluctuating mentalization processes), and it is also possible that social desirability effects influenced the participants' responses. Third, the RFQ-8 and the RFQ-R-7 provided only a superficial assessment of reflective functioning (e.g., measuring mostly hypomentalization tendencies regarding the individual's own behavior), thus it was not possible to explore the multidimensional nature of mentalization (i.e., other mentalization questionnaires were not included in the study questionnaires). That is, the present study provided data specifically on the dimensionality of the RFQ-8 and not on mentalization difficulties in general. Moreover, the present study did not analyze the association between the RFQ-R-7 and other mentalization-based measures. Fourth, it is also possible that relevant psychological predictors were not included in the multivariate MIMIC analyses, which might have influenced the findings (e.g., psychopathological symptom levels of borderline personality disorder or major depressive disorder, other emotion regulation strategies and personality traits). Finally, total scale scores were calculated for each psychological predictor of mentalization difficulties. It is possible that using subscale scores (e.g., cognitive and behavioral impulsivity, impatience/restlessness in the case of the BIS-R-21) would have revealed more detailed associations between difficulties in reflective functioning and the other psychological constructs.

Conclusions
In line with previous literature, the present study suggested that it is preferable to use the original response scale of the RFQ-8 to measure the level of difficulties in reflective functioning [11,22]. Moreover, CFA underlined that more precise measurement of mentalization difficulties can be obtained by using a modified, seven-item version of the brief RFQ without Item 7 (RFQ-R-7). A unidimensional model with a general mentalization difficulties factor provided the most optimal level of model fit and psychometric characteristics. This model indicated that future studies might consider using the total scale score. However, the interpretation of the low and high scores on the general mentalization difficulties factor (i.e., the total score of the RFQ-R-7) might be equivocal. It might be possible that this dimension predominantly captures difficulties of hypomentalization (uncertainty), thus low and high scores on the RFQ-R-7 represent the absence and presence of hypomentalization difficulties, respectively. Alternatively, in line with the original conceptualization of the scoring of the RFQ-8, low and high scores on the RFQ-R-7 might represent high rates of hypermentalization (certainty) and hypomentalization (uncertainty), respectively.
Overall, the present findings suggested that the RFQ-8 and the RFQ-R-7 provide measurement of hypomentalization difficulties, whereas the construct of hypermentalization is assessed only to a limited extent with this set of items. Specifically, associations between mentalization difficulties and maladaptive psychological characteristics indicated that low scores on the RFQ-R-7 are associated with more adaptive psychological characteristics (e.g., lower levels of impulsivity, sensation seeking, rumination, worry). That is, in accordance with previous findings [11,22], these data might indicate that lower scores on some of the items of the RFQ are most likely not signs of maladaptive, certainty-related reflective functioning (i.e., hypermentalization). In other words, it might be possible that the total score on the RFQ-R-7 captures a dimension of hypomentalization ranging between low and high levels of difficulties regarding uncertainty. Previous findings also revealed that the RFQ-8 allows only limited measurement of hypermentalization [11,22]. Therefore, future studies might consider revising the brief version of the RFQ to contain items which are more suitable for measuring hypermentalization.